Grepator: Accents & Case Mix for Thesaurus
Abstract
Researchers and students have a real need for pedagogical resources. In France, information retrieval techniques have been developed, for example in the Doc'CISMeF web site. As in PubMed, documents are indexed with (French) MeSH terms. One of the problems revealed by quality studies is the mismatch between user requests and the MeSH controlled vocabulary. Moreover, French (like Greek or Spanish) poses specific indexing problems because of its diacritic characters. In this article, we present the Grepator project. Its main goal is to convert any thesaurus (or any entry) to mixed case with accented characters for a specific domain. Grepator must also complete MeSH terms according to their usual form in natural language and, finally, correct user spelling mistakes. Grepator is based on a statistical approach. A large French medical corpus has been built from pedagogical resources indexed in CISMeF. Using regular expressions, Grepator searches the corpus for the most usual way to spell each word. With this method, seventy-five percent of MeSH terms are found in the corpus, with less than one mistake per hundred words. We analyze this first evaluation of the tool and discuss further steps that might be developed.
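The corpus lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `VARIANTS` table, the function names, and the sample sentences are assumptions made for illustration. The idea shown is to turn an unaccented, upper-case term into a regular expression that tolerates any case and any accent, collect every spelling found in a corpus, and keep the most frequent one.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical table: each base letter and the accented forms it may stand for.
# A real system would need a fuller table (ligatures, other languages).
VARIANTS = {
    "a": "aàâä",
    "c": "cç",
    "e": "eéèêë",
    "i": "iîï",
    "o": "oôö",
    "u": "uùûü",
}


def strip_accents(text):
    """Remove diacritics by decomposing and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_pattern(term):
    """Build a regex that matches the term in any case, with any accents."""
    parts = []
    for ch in strip_accents(term).lower():
        if ch in VARIANTS:
            parts.append("[" + VARIANTS[ch] + "]")
        else:
            parts.append(re.escape(ch))
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)


def most_usual_form(term, corpus):
    """Return the most frequent spelling of the term in the corpus, or None."""
    # Normalize to NFC so accented letters are single code points.
    corpus = unicodedata.normalize("NFC", corpus)
    counts = Counter(m.group(0) for m in build_pattern(term).finditer(corpus))
    return counts.most_common(1)[0][0] if counts else None


# Made-up sample text, standing in for the French medical corpus.
sample = "Le diabète est fréquent. Le Diabète de type 2. Un diabete et un diabète."
print(most_usual_form("DIABETE", sample))
```

Choosing the most frequent surface form is one simple reading of "statistical approach"; the abstract does not say how ties or rare variants are handled, so the sketch leaves that open.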
Similar resources
A Survey of the Status of Software for Managing and Presenting Persian Thesauri
The current study is devoted to investigate softwares for managing and providing Persian thesaurus. Therefore, using survey-descriptive method, we have analyzed five thesaurus management softwares, including the softwares “Islamic Sciences Thesaurus”, “Thesaurus Builder”, “Pars Azarakhsh”, “Ghamoos” and “published version of Ebrahimpoor Thesaurus”, along with four softwares for providing thesau...
Case Mix Planning using The Technique for Order of Preference by Similarity to Ideal Solution and Robust Estimation: a Case Study
Management of surgery units and operating room (OR) play key roles in optimizing the utilization of hospitals. On this line Case Mix Planning (CMP) is normally applied to long term planning of OR. This refers to allocating OR time to each patient’s group. In this paper a mathematical model is applied to optimize the allocation of OR time among surgical groups. In addition, another technique is ...
Some Problems with Accents in TEX: Letters with Multiple Accents and Accents Varying for Uppercase/Lowercase Letters
The problems of using the internal command \accent as a tool for support of some Cyrillic writing systems is investigated. It is shown that the internal features of \accent prevent construction of some Cyrillic letters which require several accents simultaneously. A special macro which emulates the work of \accent by some other commands is suggested. The accents for I/i and J/j, which are diffe...
A Comparative Study of the Thesauri of Islamic Teachings and Quranic Sciences
This study examines the comparative strengths and weaknesses of the thesaurus and thesaurus Quranic teachings of the Koran. In today's society where the documents are kept electronically, retrieval and dissemination of information for the development of research, much greater importance of saving documents and thesaurus that is the basis for indexing in various sciences, One of the solutions fo...
A Comparative Study of the Semantic Relations, Formal Structure, and Management Systems of the Technical-Engineering and NAMA Thesauri
Purpose: Thesauri as important tools in storage and retrieval information systems have a significant role in the optimization of database search. So the publishing of thesauri needs to use standards as much as possible. I examined and compared two important thesauruses on the basis of ANSI/NISO z39.19 2005. Methodology: This study is an analytical and applied survey. The study population was t...
Journal title:
- Studies in health technology and informatics
Volume 116
Pages -
Publication date 2005